# 07. Rover Sample


The next environment you will explore and work with uses the C++ API and Gazebo.

The repository contains a gazebo-rover.world file that defines the environment with four main components:

  • The rover.
  • A camera sensor, to capture images to feed into the DQN.
  • A maze.
  • Obstacles that block the rover's path.

## Running the Rover

To test the rover sample, open the desktop in the Udacity Workspace, open a terminal, and once again navigate to the folder containing the samples with:

$ cd /home/workspace/jetson-reinforcement/build/x86_64/bin

Launch the executable from the terminal:

$ ./gazebo-rover.sh

## More about the Rover

The robotic rover model, found in the gazebo-rover.world file, calls upon a Gazebo plugin called RoverPlugin. This plugin is responsible for creating the DQN agent and training it. The plugin's shared object file, libgazeboRoverPlugin.so, attached to the robot model in gazebo-rover.world, integrates the simulation environment with the RL agent. The plugin is defined in the RoverPlugin.cpp file, also located in the gazebo/ folder.

The RoverPlugin.cpp file takes advantage of the C++ API covered earlier. This plugin creates specific constructor and member functions for the RoverPlugin class defined in RoverPlugin.h. Some of the important methods are discussed below:

### RoverPlugin::Load()

This function is responsible for creating and initializing nodes that subscribe to two specific topics: one for the camera, and one for the collision sensor. For each of the two subscribers, there is a callback function defined in the file:

  • RoverPlugin::onCameraMsg() - This is the callback function for the camera subscriber. It takes the message from the camera topic, extracts the image, and saves it. This is then passed to the DQN.
  • RoverPlugin::onCollisionMsg() - This is the callback function for the collision sensor. It tests whether the collision sensor, defined in gazebo-rover.world, has observed a collision with another element/model. This callback can also be used to define a reward function based on whether a collision occurred.

In Gazebo, subscribing to a topic has the following structure:

gazebo::transport::SubscriberPtr sub = node->Subscribe("topic_name", callback_function, class_instance);

Where,

  • callback_function is the method that's called when a new message is received, and
  • class_instance is the object instance on which the callback is invoked.

You can refer to the Gazebo transport documentation for more details on the above.

### RoverPlugin::createAgent()

Previously, for the fruit and catch samples, you created a DQN agent. The createAgent() class function serves the same purpose here: it creates and initializes the agent. In RoverPlugin.cpp, the various parameters passed to the agent's Create() function are defined at the top of the file. Some of them are:

// Define DQN API Settings
#define INPUT_WIDTH   64
#define INPUT_HEIGHT  64
#define INPUT_CHANNELS 3
#define OPTIMIZER "RMSprop"
#define LEARNING_RATE 0.1f
#define REPLAY_MEMORY 10000
#define BATCH_SIZE 32
#define GAMMA 0.9f
#define EPS_START 0.9f
#define EPS_END 0.05f
#define EPS_DECAY 200
#define USE_LSTM true
#define LSTM_SIZE 256
#define ALLOW_RANDOM true
#define DEBUG_DQN false

### RoverPlugin::updateAgent()

For every frame that the camera receives, the agent needs to take an appropriate action.

The network selects one output for every frame. This output (the action value) is then mapped to a specific action. The updateAgent() method receives the action value from the DQN and carries out the corresponding action on the rover.

There are four possible ways to control the rover:

  • drive in reverse
  • drive forward
  • turn left
  • turn right

### RoverPlugin::OnUpdate()

This method is primarily used to issue rewards and train the DQN. It is called at every simulation iteration and can be used to update the robot's position, issue end-of-episode (EOE) rewards, or issue interim rewards based on the desired goal.

At EOE, various parameters for the API and the plugin are reset, and the agent's current accuracy at performing the task is displayed in the terminal.